DOCUMENT RESUME

ED 237 578                                              TM 830 848

AUTHOR          Baker, Linda
TITLE           Spontaneous versus Instructed Use of Multiple
                Standards for Evaluating Comprehension: Effects of
                Age, Reading Proficiency, and Type of Standard.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        [83]
GRANT           NIE-G-81-0100
NOTE            41p.
PUB TYPE        Reports - Research/Technical (143)

EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     Age Differences; Cognitive Style; *Evaluation
                Criteria; *Grade 4; Grade 5; Intermediate Grades;
                *Prose; Reading Ability; *Reading Comprehension;
                *Reading Strategies; *Self Evaluation
                (Individuals)
IDENTIFIERS     *Comprehension Monitoring; Embedding (Grammar);
                *Embedding Transformations



ABSTRACT

Fourth and sixth grade children differing in reading proficiency read
and commented on brief expository passages containing three different
types of embedded problems (nonsense words, prior knowledge violations,
and internal inconsistencies). Half of the children were specifically
instructed as to the types of standards they should apply in order to
detect the problems (lexical, external consistency, and internal
consistency); the remaining children were simply instructed to look for
problems. Both quantitative and qualitative differences in standard use
were revealed by the children's comments about all parts of the
passages. Older and better readers used more different standards and
they used them more frequently than younger and poorer readers. The
lexical standard was more likely to be adopted spontaneously than the
other two standards and it was the only standard used by a substantial
proportion of both younger and poorer readers. The results demonstrate
that children differ in their ability to decide for themselves whether
or not they understand but that their performance depends in part on
the amount of guidance they are given. (Author)



Reproductions supplied by EDRS are the best that can be made from the
original document.



Spontaneous Versus Instructed Use of Multiple Standards
for Evaluating Comprehension: Effects of Age,
Reading Proficiency, and Type of Standard



Linda Baker 
University of Maryland Baltimore County 






Running head: Standards of Evaluation






Abstract

Fourth and sixth grade children differing in reading proficiency read and
commented on brief expository passages containing three different types of
embedded problems (nonsense words, prior knowledge violations, and internal
inconsistencies). Half of the children were specifically instructed as to the
types of standards they should apply in order to detect the problems (lexical,
external consistency, and internal consistency); the remaining children were
simply instructed to look for problems. Both quantitative and qualitative
differences in standard use were revealed by the children's comments about all
parts of the passages. Older and better readers used more different standards
and they used them more frequently than younger and poorer readers. The
lexical standard was more likely to be adopted spontaneously than the other
two standards and it was the only standard used by a substantial proportion of
both younger and poorer readers. The results demonstrate that children differ
in their ability to decide for themselves whether or not they understand but
that their performance depends in part on the amount of guidance they are
given.






A common paradigm in research on comprehension monitoring has been to
examine children's ability to identify a particular type of problem embedded
in a text (e.g., Garner, 1980; Garner & Kraus, 1981-82; Harris, Kruithof,
Terwogt, & Visser, 1981; Markman, 1979). A difficulty with this approach is
that it has fostered a conception of comprehension monitoring as a unitary
phenomenon rather than as a multidimensional process. Consider, for example,
a typical study in which children are tested for their detection of
contradictory information embedded within a passage. If the children fail to
notice the inconsistencies, the researcher concludes that the children were
poor at monitoring their comprehension. What the study in fact has shown,
however, is that the children were poor at evaluating their understanding with
respect to internal consistency. In other words, the study reveals a
difficulty in using one particular standard of evaluation; it says nothing
about the use of other standards.

The purpose of this paper is to present evidence for a richer
conceptualization of comprehension monitoring. Just as we have come to
realize that comprehension is a complex multi-faceted process (cf. Spiro,
Bruce, & Brewer, 1980), so too must we realize that effective evaluation of
that comprehension is multidimensional (cf. Baker, in press; Markman, 1981).
There are many different standards or criteria that readers must take into
account when they decide whether or not they understand, a fact that much of
the existing research has failed to recognize. Moreover, there may be
differences in the likelihood that particular standards will be applied,
differences which may account for some of the apparent contradictions in the
literature. For example, one study may show that ten year olds are
exceptionally good at noticing embedded nonsense words while another may show
that ten year olds are very poor at noticing inconsistencies. In both
studies, general conclusions are drawn about comprehension monitoring as a
unitary phenomenon; in one case comprehension monitoring is concluded to be
good, in the other poor. The discrepancy can be better understood when we
consider that two different standards of evaluation were required.

The study to be reported examined children's use of multiple standards of
evaluation while reading expository text. Three standards were selected for
investigation: lexical, internal consistency, and external consistency.
The lexical standard involves consideration of individual word meanings and
can be applied without regard to surrounding context. The internal
consistency standard involves evaluation of the consistency of different
propositions within the text. The external consistency standard involves
evaluation of the consistency of a proposition in the text with respect to
prior knowledge. The standards obviously differ in their processing demands
and so are likely to differ in their ease of application. For example, use of
the internal consistency standard entails several steps: accessing a memory
representation if the propositions are widely separated, integrating the
propositions, and comparing them. Use of the lexical standard, in contrast,
requires only a check that the word is present in one's internal lexicon.

Problems were embedded within the passages to ensure that there would be
at least some occasions when each type of standard could be used.
Accordingly, some passages contained nonsense words, some contained internal
inconsistencies, and some contained prior knowledge violations. But there was
no assumption that these problems were the only aspects of the passages that
might be found problematic by individual children. Although problem
identification served as one dependent measure, of more interest were the
types of standards subjects used and their patterns of use. Therefore, all of
the comments made about the entire passages were coded as to the type of
standard they reflected and these data provided the basis for several
additional dependent measures.

Participants in the study were fourth and sixth graders, identified as
either better or poorer readers on the basis of standardized test scores. The
children's task was to read each passage silently, underlining anything that
appeared problematic and then explaining why they had done so. Half of the
children were specifically told that three different types of problems would
be present in the passages and they were given examples of each type. The
remaining children were simply informed that problems would be present. The
specific instruction condition was designed to provide information about the
use of the different standards when subjects are induced to use them. The
general instruction condition was designed to reveal what types of standards
the subjects spontaneously adopt when given instructions to evaluate their
understanding carefully. Note that this condition does not correspond to an
uninformed control group: there is too much evidence that readers frequently
will not identify embedded problems if they are not told that problems are
present, in line with Grice's (1975) cooperative principle.

It was expected that the specificity of the instructions would have a
strong effect on overall standard use, consistent with previous evidence that
children identify more problems when explicitly told what to look for (e.g.,
Baker, Note 1; Markman & Gorin, 1981). Of particular interest was whether the
instruction manipulation would have differential effects depending on age,
reading level, and type of standard. For example, perhaps older and/or better
readers spontaneously adopt several different standards for evaluating their
comprehension. If so, there should be little difference between general and
specific instructions for the more mature readers. Alternatively, perhaps the
poorer readers will have difficulty applying the standards when they are told
to do so, so only the better readers will benefit from specific instructions.
It is also possible that certain standards are more likely to be adopted
spontaneously than others. A likely candidate is the lexical standard
which, because it is one that children are frequently exhorted to apply, may
be used equally often by children receiving general and specific instructions.

Subjects

A total of 108 children participated in the study. The children were
enrolled in the fourth and sixth grades of three suburban public schools.
There were 54 fourth graders (29 girls), with a mean age of 9.46 years
(s.d. = .37). There were 54 sixth graders (27 girls), with a mean age of 11.51
years (s.d. = .40). Candidates for participation in the study were pre-selected
on the basis of their reading scores on the California Achievement Test.
Children were selected for the better reading group if they had stanine scores
of 7, 8, or 9 and for the poorer reading group if they had stanine scores of
3, 4, or 5. (Although a score of 5 is normatively average based on the entire
U.S. sample, within this particular county the mean was 6.5.) The final
sample consisted of 31 better and 23 poorer fourth grade readers (mean stanine
scores = 7.77 and 4.09, s.d. = .76 and .75, respectively), and 32 better and
22 poorer sixth grade readers (mean stanine scores = 7.38 and 4.26, s.d. = .49
and .70, respectively). The two sexes were evenly represented within each
group. The subjects within groups were randomly assigned to one of two
instructional conditions, with the constraint that there be equal numbers of
boys and girls in each condition. (Note that gender was not treated as a
factor in this study because the number of subjects per cell would have been
too small.)

Materials

The first step in developing the passages was to consult three non-fiction
books written for elementary school children. One book dealt with the
weather, a second with the planets, and the third with geographical regions in
the United States. The books served as the source of information for the
passages, but the passages bore little structural similarity to the books.
Five passages were constructed dealing with each topic. The passages were
very similar in overall length (mean number of words = 49.25, s.d. = 1.42,
range = 47-51) and all consisted of six sentences. Across passages, the
length of each sentence occupying the same serial position was the same (i.e.,
sentence 1 had eight words, sentence 2 had 10 words, etc.). The
opening sentence of each passage introduced the topic and subsequent sentences
provided relevant facts about the topic. The number of propositions was
comparable across passages (mean = 11.96, s.d. = 1.16). The passages were
checked for readability using the Harris-Jacobson formula (Harris & Sipay,
1980). This formula, which takes both sentence length and vocabulary
difficulty into account, yielded a mean readability level of 3.22 (s.d. = .13,
range = 3.03 to 3.43).






After the passages were written, 12 of the passages were made problematic
by replacing a single word with another. Four of the modified passages
contained two-syllable nonsense words which followed standard rules of English
orthography. An example of a nonsense word appears in the following sentence:
"It is so hot that most brugens would melt there." Four other passages
presented information that violated common world knowledge. An example is:
"They used sand from the trees to make many things." The third type of
problem, embedded in the remaining four passages, was internal inconsistency,
created by making one sentence in the passage conflict with a previously
presented sentence. One sentence intervened between these two target
sentences. An example follows: "The temperature on Venus is much higher than
boiling water. Venus is about the same size as Earth. But it is much too
cold for us to live there." The nonsense words and the prior knowledge
violations were always embedded within the middle of the fourth sentence and
they always replaced nouns. The internal inconsistencies always involved
information contained in sentences two and four; the substitution appeared in
the middle of sentence 4 and was always an adjective.

In order to ensure that the problems would be perceived as such by adult
readers, the passages were first presented to 20 undergraduates who were asked
to identify the problems. They were explicitly informed that three different
types of problems would be present in the passages and were given examples of
each type. Subjects correctly identified 96% of the nonsense words, 87% of
the inconsistencies, and 85% of the prior knowledge violations. In a few
instances, several students were in agreement that there was a problem in a
sentence that was not intended to be problematic and so these sentences were
rewritten.



The materials were then subjected to a second screening test with a
different group of 20 undergraduates who were presented with versions of the
passages that had the critical word deleted (i.e., the word that changed the
passage from non-problematic to problematic). The subjects were instructed to
try to fill in the missing word by using the surrounding context. Of concern
was whether the context would constrain the word choices to the same extent
for the different problem types, since differences in detection could result
if the appropriate words were highly constrained for one type (and so subjects
need not process the word very deeply), but not for others. However,
contextual constraints were high for all problem types, with subjects
supplying the correct word or a reasonable substitution 95-100% of the time.

The final versions of the passages were assembled into booklets. Each
passage was typed on a separate sheet of paper and was headed by a descriptive
title. At the bottom of each page was a set of four schematic faces with
expressions ranging from very happy (full smile) to very sad (full frown).
These faces were used by the children to indicate whether the passage was
problematic, and if so, to what extent it affected comprehensibility (see
Procedure section). The same three passages appeared in first, second, and
third positions in each booklet and contained a nonsense word, an
inconsistency, and a prior knowledge violation, respectively. These passages
served as warm-up passages (i.e., although no feedback was provided,
children's responses were not scored). The remaining 12 passages were
arranged in a constrained, partially randomized order. The constraint was
that a particular problem type had to appear in a particular position across
booklets, but the passage containing that problem was randomly selected from
the pool. For example, the first experimental passage was nonproblematic; for
any given child it could be any one of the three nonproblematic passages. The
second passage always contained a prior knowledge violation, and it too could
be any one of three, etc.
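
The constrained, partially randomized ordering can be illustrated with a
short sketch. It is not the procedure actually used to assemble the booklets.
The passage labels are placeholders, and only the first two position
categories (nonproblematic, then prior knowledge violation) come from the
text; the categories assigned to the remaining ten positions are assumed
purely for illustration.

```python
import random

# Pool for the 12 experimental passages: three per category.
# The labels are placeholders, not the titles used in the study.
POOL = {
    "nonproblematic": ["N1", "N2", "N3"],
    "prior_knowledge": ["K1", "K2", "K3"],
    "nonsense_word": ["W1", "W2", "W3"],
    "inconsistency": ["I1", "I2", "I3"],
}

# Category fixed for each of the 12 positions across all booklets.
# Only the first two entries are stated in the text; the rest are assumed.
POSITION_TYPES = [
    "nonproblematic", "prior_knowledge", "nonsense_word", "inconsistency",
] * 3


def build_booklet(rng):
    """Return one child's passage order: fixed categories, random passages."""
    # Shuffle the passages within each category, then deal them into the
    # positions reserved for that category.
    shuffled = {kind: rng.sample(items, len(items)) for kind, items in POOL.items()}
    return [shuffled[kind].pop() for kind in POSITION_TYPES]


booklet = build_booklet(random.Random(0))
```

Every booklet built this way has the same category in each position, while
the particular passage filling a position varies from child to child.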

Procedure

All subjects were seen individually. In order to preclude the possibility
of inadvertent bias, the experimenter was "blind" as to the subject's reading
level during testing. At the beginning of each session, the children were
given a booklet containing the passages and were asked to fill in their names,
ages, and dates of birth on the cover sheet. At the bottom of the page were
four schematic faces that the children were asked to refer to while they
listened to the relevant portions of the instructions.

The instructions were presented on tape, recorded by the experimenter.
Children in both conditions were instructed that their task was to try to find
the problems in some short passages dealing with things they learn about in
school. A problem was defined as "something that might confuse people or
something that people might have trouble understanding." Children in the
specific instruction condition were given further information about the exact
nature of the problems and two examples of each type. The examples of the
internal inconsistencies and prior knowledge violations were the same as those
used by Markman and Gorin (1981); the nonsense word examples followed the same
format. The terms used to describe the three problem types were: 1) two
parts of the paragraph that don't make sense together; 2) things that aren't
true; and 3) words that aren't really words.




Children in both conditions were instructed to underline anything they
thought was a problem as they were reading. Then, when they finished, they
were to rate the comprehensibility of the passage by circling one of the faces
at the bottom of the page. If they circled the face with the big smile, that
meant the passage did not have any problems and was easy to understand. The
face with the small smile meant there might have been a little problem, but
the passage was still easy to understand. The face with the small frown meant
there was a little problem and the passage was a little hard to understand.
And the face with the big frown meant there was a big problem, which made the
passage very hard to understand. (As it turned out, subjects rarely circled
either of the frowning faces; instead, they used the big smile to indicate no
problem and the small smile to indicate a problem. For this reason, the
rating data provided by the faces were not subjected to any statistical
analyses.) The subjects were further informed that after they circled a face,
they were to explain why they made that choice and also why they underlined
any words or phrases.

After the instructions were presented, the children were asked to turn to
the first (practice) passage and read it silently to themselves. Care was
taken to ensure that the children understood the task by reminding them, if
necessary, to underline problematic information as soon as they encountered it
and by pointing out any lack of correspondence between ratings and underlining
(e.g., the face with the small smile was circled but no sentences were
underlined). The children were not given feedback as to whether or not they
had correctly identified the intended problems. After completing the three
practice passages, the children went on to the 12 experimental passages. For
each passage, the children read it silently at their own pace, underlining any
problematic sections as they went along. When they finished reading the
passage, they circled one of the faces and then explained why they had done
so.

Upon completion of the task, the children were asked a standard set of
questions designed to reveal how they had interpreted any problems they did
not report. Before asking each question, the experimenter turned to the
appropriate passage and placed it in front of the child, who was encouraged to
reread it to find the answer. If the child did not spontaneously indicate
awareness of the problem after answering the question, the experimenter asked,
"Does that make sense to you?" If the child still did not indicate problem
awareness, a second prepared question was asked. To illustrate, the questions
for the prior knowledge problem embedded in the sentence, "They used sand from
the trees to make many things," were as follows: "What part of the tree did the
settlers use for making things?" and "Does sand come from trees?" The questions
were only asked for those problems the child did not initially identify, so
the number of questions asked varied from child to child.

All sessions were tape recorded and the tapes were subsequently
transcribed. The length of the sessions varied, averaging about 25 minutes
per child, with a range of 10 to 45 minutes.

Scoring

Problem identification. Responses were scored as problem identifications
if the child underlined the target information and gave an adequate
explanation of the nature of the problem. It was not necessary for the child
to specify the problem type. So, for example, if a child underlined the
nonsense word brugens and then said, "I don't know what that word means," the
response was scored as a correct identification. Similarly, if a child
explained that she underlined the phrase "sand from trees" because she didn't
think you could get sand from trees, this was scored as correct. There was
never any question on the basis of the responses to nonsense words and prior
knowledge violations as to whether the child noticed the intended problem.
However, there was occasionally some ambiguity in the initial response to
internal inconsistencies which necessitated further questioning. Most of the
children only underlined the second target sentence of the inconsistency, and
when asked to explain why they underlined that sentence, some did not mention
the first target sentence. For example, if a child underlined "Venus is much
too cold for us to live there" and explained only that he knew Venus wasn't too
cold, it was necessary to ask him how he knew that. At this point, most of
the children identified the relevant information in the first target sentence.
However, on 24 of the 324 possible occasions (3 inconsistencies x 108
subjects), the children indicated that they knew something was not true
because, for example, they had learned it in school. These children were then
asked if there was anything in the passage that supported this idea. If they
mentioned the first target sentence, they were given credit for identifying
the inconsistency, but if they did not, their responses were not scored as
correct. Five percent of the responses fell in this latter category.

Standard application. Once the response protocols were fully transcribed,
the children's responses to "non-problematic" segments of text were coded as
to the type of standard they revealed. The coding scheme used was that
established by Baker (in press) and included seven categories: the three
which were the main focus of the present study (lexical, external
consistency, and internal consistency), plus syntax; informational clarity and
completeness (e.g., "They should tell you more about how you get medicine from
tree bark"); propositional cohesiveness (e.g., "Does the word 'it' refer to
the desert or the weather?"); and structural cohesiveness (e.g., "Here it says
it's hot and then it says it again down here; why are they repeating
themselves?"). It is important to note that no judgments were made as to
whether the comments reflect valid problems. For example, if a subject
misunderstood a particular sentence and so said that the following sentence
was inconsistent, this was scored as an application of the internal
consistency standard just as if the two sentences were in fact inconsistent.

The protocols were scored by two independent judges who were able to
classify all but 6% of the comments into one of the seven categories.
Inter-judge agreement was high and the few discrepancies were resolved through
discussion. However, the latter four standards were seldom used; only the
informational completeness standard was reported by more than four children
(14 children used this standard). Most of the data analyses therefore will
focus only on the three most commonly used standards: lexical, internal
consistency, and external consistency.
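
As an illustration of how coded comments can be turned into per-child
measures, the sketch below tallies one child's comment codes. The category
names follow the seven-category scheme described above, but the identifiers,
the use of None for an unclassifiable comment, and the sample data are all
hypothetical.

```python
from collections import Counter

# The seven coding categories; the identifier spellings are assumptions.
STANDARDS = (
    "lexical",
    "external_consistency",
    "internal_consistency",
    "syntax",
    "informational_completeness",
    "propositional_cohesiveness",
    "structural_cohesiveness",
)


def summarize(codes):
    """Summarize one child's coded comments (None marks an unclassified comment)."""
    counts = Counter(codes)
    frequency = {standard: counts[standard] for standard in STANDARDS}
    used = [standard for standard in STANDARDS if frequency[standard] > 0]
    return {
        "frequency": frequency,            # how often each standard was applied
        "n_different": len(used),          # number of different standards used
        "exclusive": used[0] if len(used) == 1 else None,  # sole standard, if any
    }


# Hypothetical protocol: two lexical comments, one inconsistency comment,
# and one comment the judges could not classify.
example = summarize(["lexical", "lexical", "internal_consistency", None])
```

Keeping unclassified comments out of the tallies mirrors the text, where a
small share of comments fit none of the seven categories.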

Responses to follow-up questions. Children's responses were scored for
the extent to which they revealed problem awareness and/or resolution. A
child received a score of 1 if he or she spontaneously mentioned, on first
being questioned, that there was a problem. A score of 2 was awarded if the
child reported the problem when asked, "Does that make sense to you?", and a
score of 3 was given if the child acknowledged the problem when being asked
directly about the problematic information. The maximum score of 4 was given
on the rare occasions when the child still failed to see the problem.
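
The four-point scale just described amounts to recording the earliest probe
at which the child showed awareness of the problem. A minimal sketch follows;
the function and argument names are hypothetical.

```python
def follow_up_score(spontaneous, after_sense_probe, after_direct_question):
    """Score problem awareness on the 1-4 scale; lower means earlier awareness."""
    if spontaneous:            # mentioned the problem on first being questioned
        return 1
    if after_sense_probe:      # reported it after "Does that make sense to you?"
        return 2
    if after_direct_question:  # acknowledged it when asked directly
        return 3
    return 4                   # still failed to see the problem


# A child who noticed the problem only after the direct question.
score = follow_up_score(False, False, True)
```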

Results 

This section of the paper is divided into three parts. The first
part presents the results for the problem identification task. The data
analyses included both analysis of variance and multiple regression
procedures. The second, most important part focuses on the application of
the various standards throughout the entire testing session. Several
different dependent measures will be examined: 1) frequency of use; 2)
failures to use particular standards; 3) exclusive use of single standards; 4)
number of different standards used; and 5) the relationship of standard use to
problem identification. The third part presents the results of the
analysis of the subjects' responses to the follow-up questions.

Problem Identification

The mean number of problems of each type identified by the children as a
function of grade, reading proficiency, and instruction condition is presented
in Table 1. Note first of all the low levels of identification of all three
problem types.

Insert Table 1 about here

These levels were affected, however, by each of the factors of
interest, as revealed by a mixed-model analysis of variance. Sixth graders
identified more problems than fourth graders (53% vs. 34%), F(1,92)^4,4^1,
p < .001, and better readers identified more problems than poorer readers
(54% vs. 29%), F(1,92) = 34.19, p < .001. Children receiving specific
information about the nature of the problems identified more problems than
children told only generally that problems would be present (55% vs. 32%),
F(1,92) = 26.81, p < .001. Finally, children identified more nonsense words
than either prior knowledge violations or internal inconsistencies (53% vs.
39% and 38%, respectively), F(2,184) = 5.95, p < .001.

Contrary to expectations, problem type did not interact with grade,
reading proficiency, or instruction condition. However, there were reliable
interactions of reading proficiency with age, F(1,92) = 7.31, p < .01, and of
reading proficiency with instruction condition, F(1,92) = 4.36, p < .05.
Although sixth graders identified more problems than fourth graders, this
effect was largely attributable to children in the better reader group. The
older better readers identified 68% of the problems as compared to the 40%
identified by the younger better readers. Among the poorer readers, the sixth
graders identified 31% of the problems and the fourth graders 25%, a
nonsignificant difference. Thus, although there is substantial improvement
with development among children who are effective readers, the less effective
readers do not show significant gains. This pattern is consistent with the
conventional wisdom that poorer readers tend to fall further behind as they go
through school.

Children receiving specific instructions identified more problems than
those receiving general instructions, but this effect too was mediated by
reading level. Among the better readers, problem identification went from 39%
with general instructions to 70% with specific instructions. The improvement
for the poorer readers was much less substantial (23% to 34%), though still
statistically reliable. Thus, the better readers were much more successful at
adopting the experimenter-provided criteria for evaluating the passages. This
suggests that the difficulty poorer readers experience in evaluating their
understanding is not simply the result of their not knowing what criteria
to use. Nevertheless, the fact that they did show modest gains is
encouraging; in fact, their identification rate under specific instructions
was not significantly different from that observed for the better readers
under general instructions.
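
The cell percentages in this interaction can be related to the overall
instruction effect with a small weighted-average check. The sketch assumes
that the 63 better readers (31 + 32) and 45 poorer readers (23 + 22) were
divided between the two instruction conditions in the same proportion; the
text states only that assignment within groups was random, so this is an
approximation and not a reanalysis of the data.

```python
def weighted_percent(cells):
    """Combine (n_children, percent_identified) cells into one weighted percent."""
    total_n = sum(n for n, _ in cells)
    return sum(n * percent for n, percent in cells) / total_n


# Group sizes from the Subjects section; an equal split across conditions is assumed.
N_BETTER, N_POORER = 63, 45

specific = weighted_percent([(N_BETTER, 70), (N_POORER, 34)])
general = weighted_percent([(N_BETTER, 39), (N_POORER, 23)])
```

Under that assumption the two values round to the overall rates reported for
the specific and general instruction conditions.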

The analysis of variance was based on a dichotomous classification of the
children as being better or poorer readers. The reliable main effect of
reading proficiency indicates that the two groups did indeed differ. But it
does not indicate how much of the variance in problem identification is
attributable to reading proficiency. To answer this question, multiple
regression analyses were carried out using the subjects' actual stanine scores
as predictor variables. A second predictor variable was the subject's age in
years and months. Because instruction condition was a qualitative variable,
the data were analyzed separately for the two conditions. The total number of
problems identified served as the dependent variable.

We will consider first the regression analysis for subjects in the
specific instruction condition. The predictor variables were entered into the
regression equation through a forward stepping algorithm. Stanine score was
the first variable to enter the regression equation (F to enter = 25.05),
accounting for 33% of the variance (r = .57). The age variable entered the
equation on the second step (F to enter = 3.91). The multiple correlation was
.62 and the combined proportion of the variance accounted for by the two
variables was .38. The analysis indicates, then, that when subjects were
specifically instructed as to the types of problems they should seek, reading
proficiency was a much stronger predictor of problem detection than
chronological age.
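
The forward stepping procedure described above can be illustrated with a
minimal sketch for the two-predictor case. Everything in it is an assumption
made for illustration: the function names, the simulated stanine, age, and
problem scores, and the sample size are hypothetical and are not the study's
data. The sketch also omits the F-to-enter criterion that a full stepwise
routine would apply before admitting a variable.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def forward_two_predictors(x1, x2, y, names=("x1", "x2")):
    """Return [(name, variance added)] in order of entry for two predictors."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    # Step 1: the predictor with the larger squared correlation enters first.
    first, second = (0, 1) if r1 ** 2 >= r2 ** 2 else (1, 0)
    step1_r2 = max(r1 ** 2, r2 ** 2)
    # Step 2: multiple R^2 for both predictors, from the three correlations.
    full_r2 = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return [(names[first], step1_r2), (names[second], full_r2 - step1_r2)]

# Hypothetical data: 50 simulated children (not the study's subjects).
random.seed(0)
stanine = [float(random.randint(1, 9)) for _ in range(50)]
age = [random.uniform(9.0, 12.5) for _ in range(50)]
problems = [0.6 * s + 0.3 * a + random.gauss(0.0, 1.5)
            for s, a in zip(stanine, age)]

steps = forward_two_predictors(stanine, age, problems, names=("stanine", "age"))
for name, added in steps:
    print(f"{name}: adds {added:.2f} to R^2")
```

The second entry in the result is the increment in explained variance, which
corresponds to the "additional" variance reported for the second predictor.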

The regression equation for subjects receiving general instructions
yielded a rather different solution. The first variable to enter the equation
was age (F to enter = [illegible]), accounting for 18 percent of the variance
(r = .42). Stanine score entered second (F to enter = 7.07), accounting for an
additional 10% of the variance. The multiple correlation was .52 and the
multiple r-square was .28. This analysis indicates that age is a better
predictor of problem identification than reading proficiency when subjects are
left to select the standards on their own. However, the two variables
together account for less of the variance in performance than they do for
subjects who received specific instructions.

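
The link between the reported correlations and the proportions of variance
can be checked with a short sketch. It simply squares the correlations quoted
in the text; the function name and dictionary labels are illustrative. Because
the reported correlations are themselves rounded, the squares land within
about one percentage point of the reported proportions (.33, .38, and .28)
rather than matching them exactly.

```python
def variance_explained(r):
    """Proportion of variance accounted for by a (multiple) correlation r."""
    return r ** 2

# Correlations as reported in the text for the two instruction conditions.
reported = {
    "specific, stanine alone (r = .57)": 0.57,
    "specific, stanine plus age (R = .62)": 0.62,
    "general, age plus stanine (R = .52)": 0.52,
}

for label, r in reported.items():
    print(f"{label}: {variance_explained(r):.2f}")
```
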
Application of Standards Throughout the Testing Session 

As noted earlier, the introduction of problems into passages is one way to
assess readers' use of different standards of evaluation. However, the
underlying assumption is that effective readers routinely apply certain
standards to evaluate their understanding; therefore, the use of these
standards can be revealed through any evaluative comment made about the text.
This section of the paper will present information gleaned from analysis of
the complete response protocols.

Frequency of application. Table 2 presents the mean number of times
children applied the lexical, external consistency, and internal consistency
standards. These data reflect standard application both in the service of
identifying the various problems and in comments about text that was not
intended to be problematic. Remember, no judgment was made as to whether the
standard was used appropriately or inappropriately from an adult perspective.
Either situation would reveal that the child is evaluating her understanding
with respect to a particular standard. An analysis of variance was carried
out, with age, reading proficiency, and instruction condition as
between-subjects factors and type of standard as a within-subjects factor.

Insert Table 2 about here

The analysis revealed that neither the main effect of age nor that of reading
proficiency was reliable (Fs < 1.0), but the two factors interacted,
F(1, 92) = 5.29, p < .05. This interaction differs from the age by proficiency
interaction reported earlier for problem detection in the cell corresponding
to fourth grade poorer readers. Children in this group applied the standards
more frequently than either the fourth grade better readers or the sixth grade
poorer readers (means = 2.84 vs. 2.18 and 2.24, respectively). In fact, their
frequency of standard application did not differ from that of the sixth grade
better readers (mean = 3.09). There is a difference, of course, in the
effectiveness with which the standards were applied, as the fourth grade poor
readers identified fewest actual problems. The present data indicate,
however, that their low levels of problem identification cannot be attributed
to such factors as an unwillingness to criticize the material or admit
ignorance.

Children receiving specific instructions applied the three standards more
frequently than those receiving general instructions, F(1, 92) = 22.96,
p < .001. A reliable grade by instruction condition interaction,
F(1, 92) = 3.92, p = .05, indicated that fourth and sixth graders did not
differ in the mean number of standard applications under specific
instructions (3.46 vs. 3.27), but the older children spontaneously applied
the standards more often under general instructions than did the younger
(2.26 vs. 1.47). Recall that grade did not interact with instruction
condition in the analysis of problem identification; both fourth and sixth
graders showed comparable improvements from general to specific instructions,
with the sixth graders better overall. Thus, the present data indicate that
the fourth graders complied with task demands to use the standards of
evaluation, but they applied them less effectively than did the sixth
graders.

The three standards were applied with different frequency,
F(2, 184) = 17.88, p < .001. The most frequently used standard was that of
external consistency (mean = 3.63), next was the lexical standard
(mean = 2.74), and least frequently used was the internal consistency
standard (mean = 1.44). All differences between means were statistically
reliable. These figures indicate that the external consistency standard in
particular was applied in many situations other than those intended by the
experimenter. (Recall that prior knowledge problems were detected at the same
rate as inconsistencies, and both were less often identified than nonsense
words.) An interaction of standard type with reading proficiency shows an
interesting crossover effect, F(2, 184) = 3.33, p < .05. Whereas the poorer
readers used the lexical and external consistency standards more often than
the better readers, they used the internal consistency standard dramatically
less often. Thus, although the problem detection data did not yield a
reliable problem type by proficiency interaction, the present data do
indicate that poorer readers are much less likely to evaluate text for
internal consistency than they are to evaluate for either external
consistency or word understanding.

The frequency with which the different types of standards were used also
varied with the nature of the instructions, F(2, 184) = ?.84, p < .01. Children
specifically instructed to apply the standards used the external consistency
and internal consistency standards more than twice as often as children who
received only general instructions. The data also indicate that children are
more likely to adopt external consistency and lexical standards when required
to select their own criteria for evaluating text than they are to adopt
internal consistency standards.

Failures to use particular standards. Additional information about
differences in children's standard use was obtained by classifying the
subjects as to whether they ever used a specific standard or not. The data
base for this classification was again the total number of standards used, not
simply those involved in identifying a problem. The classification is lenient
in that it is based on the assumption that a single instance of standard use
indicates that the standard is available in the child's repertoire. Table 3
shows the proportion of subjects who never used a specified standard. Visual
inspection of the table makes it quite clear that these proportions differed
considerably across cells. In order to examine these differences more
systematically, separate multiway frequency tables were created for each of
the three types of standards and tests of association were carried out using a
log-linear model.

Insert Table 3 About Here
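
As a rough illustration of what such a test of association computes, the
sketch below evaluates the likelihood-ratio statistic (G squared) for
independence in a single two-way table, which is the simplest special case of
a log-linear analysis. The study's multiway models are richer than this, and
the counts used here are hypothetical, chosen only to show the calculation.

```python
import math

def g_squared(table):
    """Likelihood-ratio statistic for independence in a two-way count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2

# Hypothetical counts: rows are instruction conditions (specific, general),
# columns are (never used the standard, used it at least once).
hypothetical = [[5, 49],
                [25, 29]]
no_association = [[10, 10],
                  [10, 10]]

print(round(g_squared(hypothetical), 2))
print(g_squared(no_association))
```

For a 2 x 2 table the statistic has one degree of freedom, so values above
3.84 would be judged reliable at the .05 level.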






Let us consider the lexical standard first. Overall, the proportion of
children who never used the standard is small (.18), as one would expect given
the relatively good identification of nonsense words. Tests of association
revealed that more subjects in the general instruction condition never used
the standard than subjects in the specific (χ² = 7.75, p < .01). However, an
interaction with reading proficiency showed that poorer readers were less
likely to use the standard under specific instructions (χ² = 5.01, p < .05).
Additionally, a grade by reading proficiency interaction indicated that a
substantial proportion of sixth grade poorer readers never used the standard
(χ² = 3.75, p = .05). This latter finding is intriguing because it suggests
that poorer readers become less willing to acknowledge word level
comprehension problems as they grow older, having learned, perhaps, that there
is a stigma associated with such admissions. Even when specifically told that
nonsense words would be present, close to a third of these children failed to
identify a single word as problematic.

Consider now the cell frequencies for children who never used an external
consistency standard. Tests of association revealed a very strong effect of
instruction (χ² = 23.71, p < .001); not surprisingly, more subjects failed to
use the standard if they were not specifically told to evaluate for external
consistency. In addition, there were many more fourth graders who never
adopted the standard than sixth graders (χ² = 14.35, p < .001). None of the
other tests of association were reliable. In particular, better readers were
no more or less likely to use the standard than poorer readers.

With respect to the internal consistency standard, we find that grade,
reading proficiency, and instructions all influence the likelihood of
adoption. More children used the standard under specific instructions than
general, as one would expect (χ² = 14.10, p < .001). More better readers used
the standard than poorer readers (χ² = 14.38, p < .001), and more older
children used it than younger (χ² = 7.11, p < .001). None of the higher order
interactions were reliable.

Finally, some comments about Table 3 as a whole can be made on the basis
of visual inspection. First, note that among the sixth grade better readers
who received specific instructions, there was not a single instance of failure
to apply any standard at least once. No other groups showed this pattern.
Among the fourth grade better readers with specific instruction, only the
lexical standard was applied by all students. Note also the differences
across standards. Overall, 18% of the children never used a lexical standard,
33% never used an external consistency standard, and a full 45% never used the
internal consistency standard. The relative ordering suggests that lexical
standards are more likely to be adopted than external consistency standards,
which in turn are more likely to be adopted than internal consistency
standards.

Exclusive use of a single standard. The proportion of children who used
only one particular standard of evaluation throughout the testing session is
shown in Table 4. With the exception of one child, the internal consistency
standard was never used exclusively. This is consistent with the view that it
is a relatively more sophisticated standard and hence is unlikely to be the
only one available in a child's repertoire. The external consistency standard
similarly was rarely used exclusively. Fewer than 5% of the children did so,
all of whom, interestingly, were less effective readers. The lexical
standard, in contrast, was used exclusively by a substantial proportion (.28)
of the subjects. The incidence differed considerably across cells and so tests
of association using a log-linear model were carried out. Results revealed
that exclusive use of the lexical standard was more frequent in the general
instruction condition than the specific (χ² = 10.25, p < .001), that it was
more frequent among poorer readers than better (χ² = 6.?5, p < .01), and that
it was more frequent among the fourth graders than the sixth graders
(χ² = ?.04, p < .01). None of the higher order interactions showed significant
associations. Thus, these findings suggest that younger and poorer readers
are more likely to evaluate their understanding at the word level only, an
outcome consistent with other studies suggesting over-reliance on a lexical
standard (e.g., Garner, 1981).

Insert Table 4 About Here

Number of different standards applied. An indication of the variety of
standards in a child's repertoire is provided by analysis of the number of
different standards used. For this analysis we again consider all of the
responses the children made, classified as to the type of standard they
revealed. The coding scheme identifies a maximum of seven different
standards; in actuality, no child used more than five. Table 5 shows the mean
number of different standards used by the children as a function of grade,
reading proficiency, and instruction condition. An analysis of variance with
these three between-subjects factors revealed reliable main effects of each,
as well as a grade by instruction interaction. Sixth graders used more
different standards than the fourth graders (2.5 vs 2.0), F(1, 92) = 8.58,
p < .01. Better readers used more different standards than poorer readers
(2.57 vs 1.91), F(1, 92) = [illegible], p < .001. And children receiving
specific instructions not surprisingly used more different standards than
those not given specific instructions (2.72 vs 1.79), F(1, 92) = [illegible],
p < .001. Finally, the grade by instruction condition interaction,
F(1, 92) = 5.6?, p < .05, reflects the fact that fourth graders who received
specific instructions did not differ reliably from the sixth graders who
received specific instructions; however, fourth graders who received general
instructions spontaneously adopted fewer different standards than did their
sixth grade counterparts.

Insert Table 5 About Here

Relationship of standard use to problem detection. It has been assumed
that any comments reflecting the use of a particular standard indicate that
that standard is in the child's repertoire and that she can use it
effectively. If this is true, then use of a particular standard should be
accompanied by detection of at least some of the corresponding problems. If a
child used a standard but did not detect any of the problems, this could
indicate that the child was simply responding to demand characteristics of the
task. The proportion of children who used a particular standard at least once
but did not identify any of the corresponding problems was calculated. In
almost half of the 24 cells, the proportion was 0, and in all but two, the
proportion was .1 or less, indicating that the standards typically were used
productively. The remaining two cells correspond to the use of the external
consistency standard by poorer readers receiving specific instructions.
Overall, 42% of the fourth graders and 40% of the sixth graders in these
groups challenged the truth of various passage statements but did not identify
any of the prior knowledge violations. This pattern, which serves to explain
the discrepancy mentioned earlier between problem detection and standard use,
indicates that these less successful readers attempted to comply with the task
demands but could not apply the external consistency standard effectively
enough to identify the intended problems.
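
The proportion described above can be sketched as follows. The record layout,
the function name, and the twelve example children are hypothetical; the
sketch also assumes that the denominator is every child in the cell rather
than only the children who used the standard, which the text does not state
explicitly.

```python
def unproductive_use_rate(children, standard):
    """Share of a cell's children who applied `standard` at least once
    but identified none of the corresponding embedded problems."""
    if not children:
        return None
    flagged = [
        child for child in children
        if child["uses"].get(standard, 0) > 0
        and child["detected"].get(standard, 0) == 0
    ]
    return len(flagged) / len(children)

# Hypothetical cell of 12 children (illustrative records, not study data).
cell = (
    # Five children challenge passage statements but find no planted problem.
    [{"uses": {"external": 6}, "detected": {"external": 0}} for _ in range(5)]
    # Four children use the standard and identify at least one problem.
    + [{"uses": {"external": 3}, "detected": {"external": 1}} for _ in range(4)]
    # Three children never apply the standard at all.
    + [{"uses": {}, "detected": {}} for _ in range(3)]
)

rate = unproductive_use_rate(cell, "external")
print(f"{rate:.2f}")
```
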
Responses to Follow-Up Questions

Children's responses to the questions they were asked about missed
problems were scored as described in the Method section. Each child's mean
score, averaged over problem types, was entered into an analysis of variance
with grade, reading proficiency, and instruction condition as between-subjects
factors. The only reliable effects were for grade, F(1, 83) = 16.69, p < .001,
and reading proficiency, F(1, 83) = 11.98, p < .001 (see Footnote 2). It made
no difference whether the initial instructions had been general or specific.
Sixth graders had lower scores than fourth graders (1.77 vs 2.10), indicating
that they perceived the nature of the problems more quickly. Similarly, better
readers had lower scores than poorer readers (1.79 vs 2.07).

The fact that the younger and poorer readers had trouble perceiving the
nature of the problems even when they were directly confronted with the
relevant information is consistent with results reported by Garner and Taylor
(1982). It suggests that problem identification is influenced by factors
other than reading experience and proficiency. For example, logical reasoning
skills seem to play an important role in the detection of internal
inconsistencies. The main effect of grade implicates developmental
differences in these skills, while the effect of reading level probably
reflects the effect of general intelligence.






Discussion 

The present study has provided a number of important insights into the
ways children evaluate their understanding as they read. Although previous
studies have provided some evidence regarding children's use of specific
criteria, none have focused on multiple standards. Moreover, the results have
typically been interpreted as though comprehension monitoring were a global
entity, something at which a child is either effective or ineffective. This
simplistic conception of comprehension monitoring must be abandoned if we are
to effect any changes in children's ability to decide for themselves whether
or not they understand.

The present study shows quite clearly that there are children who in fact
are limited in their evaluation skills. These limitations are reflected in
several different dimensions of the data. Consider, for example, children's
identification of the intentionally-introduced problems. Poorer readers were
less successful at identifying the problems than better readers, consistent
with several other studies (e.g., Garner & Kraus, 1981-82; Paris & Myers,
1981). Additionally, younger children were less successful than older
children, but the age-related change was found only among the better readers.
The fact that the older poorer readers identified no more problems than the
younger poorer readers has important implications. One interpretation of the
finding is not that the older students fail to improve in their ability to
evaluate their understanding, as the results might suggest, but rather that
they exhibit an increasing lack of confidence or incentive to do so. Some
support for this hypothesis is provided by the finding that all of the poorer
readers benefitted from instructions specifying the types of problems they
should seek, even though the benefit was not as great as for the better
readers. But a more telling argument is that a substantial proportion of the
older poorer readers never used the lexical standard of evaluation, even
though this was the standard most likely to be adopted by the majority of the
children. Whether this apparent reluctance to admit word comprehension
failures characterizes the way the students typically respond internally to
the demands of reading or whether it only occurs externally in interactions
with other people is an important empirical question.

Differences among the children were also apparent in the size and
composition of their repertoire of standards. Better readers used more
different standards than poorer readers, regardless of whether they were
instructed as to the kinds of standards they should use. This suggests that
they routinely evaluate their understanding with respect to more different
criteria than the poorer readers. Additionally, although fourth graders who
received specific instructions did not differ from sixth graders in the number
of standards used, fourth graders who were left to adopt whatever criteria
they chose had a more limited repertoire than their sixth grade counterparts.
Note in particular that the fourth grade poorer readers used an average of
only 1.02 different standards under general instructions. In other words,
they tended to rely exclusively on a single standard.

Among those children who used a single standard, it was virtually always
the lexical standard that was adopted. This finding is consistent with the
often-reported emphasis on word understanding among younger and poorer readers
(e.g., Garner, 1981; Myers & Paris, 1978). Even when specifically told to use
other standards, close to 25% of the younger readers did not. Recall that all
children did in fact know that the passages contained problems that were
defined as things that might confuse people or that they might have trouble
understanding. The children did not seem to realize that there were other
possible sources of comprehension difficulty. It seems, then, that the way
many children typically decide whether or not they understand is by checking
to make sure individual word meanings are known.



Despite this higher incidence of reliance on the lexical standard among
younger and poorer readers, there were also many children in these same groups
who never used the standard. As noted earlier, this pattern was most
pronounced among the sixth grade poorer readers. Quite clearly, there are
individual differences in the standards used by less effective readers.
Failures to question word understanding at all may be just as detrimental as
failures to consider anything but word understanding. Although a number of
good readers also never used the lexical standard, responses to the follow-up
questions revealed a different pattern of dealing with the nonsense words.
The better readers tended to have figured out during reading a plausible
meaning for the nonsense words on the basis of surrounding context, while the
poorer readers, even at the time of questioning, had difficulty coming up with
a plausible meaning.

Children's use of the external consistency standard also varied with age
and reading proficiency. Exclusive use of the standard was rare, but it was
somewhat more common among poorer readers. Among the children who never used
the standard, there were more fourth graders than sixth graders. Although
poorer readers were no less likely to use the standard than better readers,
they tended to use it more frequently and less effectively. Note in
particular that the younger poorer readers who received specific instructions
challenged the truth of 8.58 propositions on the average, yet the mean number
of prior knowledge problems they identified was only 1.11. Moreover, many
poorer readers who used the standard failed to identify any of the embedded
problems.

The internal consistency standard was present in only 55% of the subjects'
repertoires; in other words, 45% of the children never questioned the
consistency of any of the ideas within the passages. More younger and poorer
readers fell into this grouping, as did those receiving general instructions.
The fact that so many children never used the standard at all accounts for the
low detection of internal inconsistencies in the present study and it also
suggests an explanation for the poor inconsistency detection reported in other
studies (e.g., Garner, 1981; Garner & Kraus, 1981-82; Markman, 1979). Many
children do not think to evaluate their understanding with respect to internal 
consistency, and even when they are instructed to adopt such a standard, they 
still do not use it frequently, let alone effectively. Consider, for example, 
the fourth grade poorer readers: on the average, they used the internal 
consistency standard less than once throughout the entire session. Evaluation 
of internal consistency requires careful processing of the text. In contrast
to the external standard, its use cannot be "faked," as evidenced by the fact 
that only two of the 108 subjects used the standard at least once but did not 
identify any inconsistencies. 

Although the primary focus of the study was on children's use of three
specific standards for evaluating their understanding, the study provides some
evidence of the use of other standards as well. The most frequently used
non-target standard was informational completeness and clarity. A total of
[number illegible] children used the standard at least once, 11 of whom were
more proficient readers. Specificity of the instructions did not influence
its use. Although the proportion of children using the standard was small, the
reading proficiency difference suggests that better readers are more likely to
spontaneously consider whether the text contains sufficient information to
enable them to grasp the main ideas. Comments indicating that other standards
had been applied were more infrequent, probably because the passages had been
screened by adult readers for the presence of other problems. Four children
used the structural cohesiveness standard, one the propositional cohesiveness
standard, and five the syntactic. Additional research is needed to examine
more directly the extent to which children evaluate their understanding with
respect to these other standards. If we wish to improve readers'
abilities to decide for themselves when they understand and when they do not,
we must have a thorough understanding of the kinds of criteria they do and do
not use.




Reference Notes

1. Baker, L. Children's effective use of multiple standards for evaluating
their comprehension. Unpublished manuscript, University of Maryland
Baltimore County, 1983.






References 

Baker, L. How do we know when we don't understand? Standards for evaluating
text comprehension. In D. L. Forrest, G. E. MacKinnon, & T. G. Waller (Eds.),
Metacognition, cognition, and human performance. New York: Academic
Press, in press.

Garner, R. Monitoring of understanding: An investigation of good and poor
readers' awareness of induced miscomprehension of text. Journal of Reading
Behavior, 1980, 12, 55-64.

Garner, R. Monitoring of passage inconsistency among poor comprehenders: A
preliminary test of the "piecemeal processing" explanation. Journal of
Educational Research, 1981, 159-162.

Garner, R., & Kraus, C. Monitoring of understanding among seventh graders:
An investigation of good comprehender-poor comprehender differences in
knowing and regulating reading behaviors. Educational Research Quarterly,
1982, 6, 5-12.

Garner, R., & Taylor, N. Monitoring of understanding: An investigation of
attentional assistance needs at different grade and reading proficiency
levels. Reading Psychology, 1982, 3, 1-6.

Grice, H. P. Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax
and semantics (Vol. 3): Speech acts. New York: Academic Press, 1975.

Harris, P. L., Kruithof, A., Terwogt, M., & Visser, T. Children's detection
and awareness of textual anomaly. Journal of Experimental Child
Psychology, 1981, 31, 212-230.

Harris, A. J., & Sipay, E. R. How to increase reading ability. New York:
Longman, 1980.

Markman, E. M. Realizing you don't understand: Elementary school children's
awareness of inconsistencies. Child Development, 1979, 50, 643-655.

Markman, E. M. Comprehension monitoring. In W. P. Dickson (Ed.), Children's
oral communication skills. New York: Academic Press, 1981.

Markman, E. M., & Gorin, L. Children's ability to adjust their standards for
evaluating comprehension. Journal of Educational Psychology, 1981, 73,
320-325.

Paris, S. G., & Myers, M. Comprehension monitoring, memory, and study
strategies of good and poor readers. Journal of Reading Behavior, 1981.

Spiro, R., Bruce, B. C., & Brewer, W. F. (Eds.). Theoretical issues in reading
comprehension. Hillsdale, N.J.: Erlbaum, 1980.






Footnotes

The research reported in this paper was supported in part by the National
Institute of Education under Grant NIE-G-81-0100. I am grateful to the
students and staff of the Baltimore County Public Schools for their
cooperation and participation. I thank Susan Sonnenschein for her comments on
the manuscript. Address reprint requests to the author at the Department of
Psychology, UMBC, Catonsville, Maryland 21228.

1. Since there were only three examples of each problem type, it is
important to know the extent of variability among items, even though an
analysis treating items as a random effect is inappropriate. Therefore, the
frequency with which each specific problem was identified was compared to the
frequencies for other problems of the same type and different types. Despite
some variability, all of the nonsense word problems were identified more often
than any of the prior knowledge or internal inconsistency problems (p = .?7,
.?9, .60). All of the prior knowledge problems had similar identification
rates (p = .?6, .37, .41), as did the internal inconsistencies (.31, .33, .41).
Additionally, within conditions, there were no specific items which had
detection rates grossly different from the overall pattern.

2. Since the means were based only on missed problems, a child who did not
miss any problems would have no score. Seven children fell into this
category. In addition, the taped protocols for three of the children were
incomplete and so their scores could not be calculated. This reduction in
sample size accounts for the reduced degrees of freedom.






Table 1 

Mean Number of Intentionally-Introduced Problems Identified



Nonsense 
Word 



Grade Reading Instruotion 

Proficienay 

Fourth Better Specific 2.13 

General ,98 

Specifie 1*1^ 

General 1 

Speeific 2.56 

General 1,98 

Poorer SpeQific 1,10 

General 1.09 
Note— Maximum in each cell is 



Poorer 



Sixth Better 



Type of Problem 
Prior Knowledge 
Violation 



1 ,11 
.29 
2.25 
1.5^ 
1.00 
• 96 



Internal 
InGonsistaney 



1.57 
.62 
*90 
,1^ 
2.55 
l.i42 
1,20 
.34 






Table 2

Mean Number of Times Standards Were Applied
Throughout Testing Session 



Type of Standard 



Reading 

Grade Proriciency Instructions Lexioal 



i^ourth Better 



Poorer 



Sixth tiatter 



Poorer 



Specific 
General 
Speclfie 
Gen^eral 
Specific 
^CenerDl 
Specific 
General 



3.00 
1.69 

2,914 

6u 
2.17 



External 
Consistency 
^.20 
1.25 
a. 5b 
1.18 
14.94 
J. 25 
i\.lQ 
3. '42 



Internal 
Consfsteney 
1.93 
1.19 

.18 

1.75 
1.40 

.33 






Table 3

Proportion of Children Who Never Used
A Particular Standard of Evaluation

                                                  Type of Standard
          Reading                                 External      Internal
Grade     Proficiency   Instructions   Lexical    Consistency   Consistency

Fourth    Better        Specific       .00        .20           .2?
                        General        .1?        .69           .56
          Poorer        Specific       .??        .25           .5?
                        General        .1?        .82           [missing]
Sixth     Better        Specific       .00        .00           .00
                        General        .13        .25           .31
          Poorer        Specific       .30        .10           .?0
                        General        .33        .33           .67

Note. A question mark stands for a digit that is illegible in the source copy.






Table 4

Proportion of Children Who Only Used 
One Particular Standard of Evaluation 



Keading 
Grada ProfiGienQy 
Fourth Better 

'poorer 

Sixth Better 

Poor er 



Instructions 

Speeific 

Gfenttral 

Specific 

General 

Specific 

General 

Specific 

General 



Lexical 
.20 
ol 

' .75 
,00 



.1/ 



Type of Standard 
Ex~ternar~~^ ^ ^^nt e f n 
Consis :ehcy Consistency 



^10 
.^2 



.00 
.00 
,08 
,09 
.00 
• 00 
.20 
.08 



,00 

. 06 

.00 

.00 

^00 . 

.00 f 

.00 

.00 






Table 5

Mean Number of Different Standards Used
Throughout Testing Session

Reading Proficiency   Instructions   Fourth   Sixth

Better                Specific       2.??     3.13
                      General        1.67     2.62
Poorer                Specific       2.?6     2.40
                      General        1.02     1.65

Note. A question mark stands for a digit that is illegible in the source copy.






